Parasites of the genus Eimeria infect a wide range of vertebrate hosts, including chickens. We have recently reported a 
comparative analysis of the transcriptomes of Eimeria acervulina, Eimeria maxima and Eimeria tenella, integrating ORESTES 
data produced by our group and publicly available Expressed Sequence Tags (ESTs). All cDNA reads have been assembled, 
and the reconstructed transcripts have been submitted to a comprehensive functional annotation pipeline. Additional 
studies included orthology assignment across apicomplexan parasites and clustering analyses of gene expression profiles 
among different developmental stages of the parasites. To make all this body of information publicly available, we con- 
structed the Eimeria Transcript Database (EimeriaTDB), a web repository that provides access to sequence data, annotation 
and comparative analyses. Here, we describe the web interface, available sequence data sets and query tools implemented 
on the site. The main goal of this work is to offer a public repository of sequence and functional annotation data of 
reconstructed transcripts of parasites of the genus Eimeria. We believe that EimeriaTDB will represent a valuable and 
complementary resource for the Eimeria scientific community and for those researchers interested in comparative genomics 
of apicomplexan parasites. 

Database URL: http://www.coccidia.icb.usp.br/eimeriatdb/ 



Introduction 

Coccidian parasites infect a wide range of vertebrate hosts 
and cause many diseases of veterinary and human import- 
ance. Within this group, parasites of the genus Eimeria infect 
many species of wild and domestic hosts, including poultry. 
Seven distinct Eimeria species may infect chickens, causing 
enteric diseases that lead to diarrhoea, malabsorption, im- 
paired weight gain and higher susceptibility to opportunistic 
diseases. The economic impact of such diseases is reflected by
direct costs associated with reduced productivity in affected
flocks and by indirect costs related to the preventive use of
anti-coccidial drugs and/or vaccines (1). The production losses
caused by coccidiosis have been estimated at US$ 2400 million
per annum worldwide (2). Eimeria parasites are easily propagated
through oral infections of experimental animals under controlled
conditions, which makes molecular studies feasible. The genome
of Eimeria tenella, the model species, comprises ~55-60 Mb
distributed in 14 chromosomes (3). The complete sequence of
E. tenella chromosome 1 (4) and a whole-genome sequence are
publicly available on the GeneDB (5) and EuPathDB (6) databases.
In addition, a draft sequence of the Eimeria maxima genome
has been recently reported (7). The transcriptome of Eimeria
parasites has been assessed using conventional Expressed
Sequence Tag (EST) (8-13) and ORESTES (open reading
frame ESTs) reads (14). In addition, Amiruddin et al. (15)
obtained full-length cDNA sequences of 443 E. tenella genes.
The cDNA libraries used in all these works have been derived 
mainly from the most accessible developmental stages, 
including oocysts in different phases of sporulation, sporo- 
zoites and first- and second-generation merozoites. Based on 
genomic gene prediction and transcriptome assembly data, 
the transcriptome complexity of E. tenella has been esti- 
mated as circa 8700 genes (14). We have recently reported 
an integrated and comparative analysis of the transcriptome 
of Eimeria acervulina, E. maxima and E. tenella, including 
ORESTES data produced by our group and publicly available 
ESTs (14). All cDNA reads were assembled and the recon- 
structed transcripts submitted to a comprehensive functional 
annotation pipeline. Comparative studies included orthology 
assignment across apicomplexan parasites and clustering 
analyses of gene expression profiles among different developmental
stages of the parasites. To make this body of
information freely available to the scientific community, we
constructed the Eimeria Transcript Database (EimeriaTDB), a
web repository that provides access to sequence data, annotation
and comparative analyses. Here, we describe the web
interface, available sequence data sets and query tools im- 
plemented on the site. The main goal of this work is to offer 
a public repository of sequence and functional annotation 
data of reconstructed transcripts of parasites of the genus 
Eimeria. 

Data content of current release 

EimeriaTDB v. 1.1 contains transcript sequences of E. acervulina,
E. maxima and E. tenella, reconstructed from EST and
ORESTES data, as previously described (14). In total, the current
version comprises data sets of 3413, 3426 and 8700
assembled transcripts, respectively. The cDNA reads are
derived from several developmental stages of the parasites,
including unsporulated oocysts, sporoblast-phase oocysts,
sporulated oocysts, sporozoites and first- and second-generation
merozoites. EimeriaTDB comprises assembled
and unassembled data, annotation of individual assembled 
sequences and global analysis of each transcriptome data set. 
Digital expression data, based on the frequency of reads be- 
longing to each assembled transcript, are also available. 
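
As an illustration of how digital expression values of this kind can be derived, the following minimal Python sketch counts reads per assembled transcript and developmental stage and scales the counts by library size. The input layout, the function name, the DEMO identifiers and the per-1000-reads scaling are assumptions made for this example only; they are not taken from the EimeriaTDB pipeline.

```python
from collections import Counter, defaultdict


def digital_expression(read_assignments, per=1000):
    """Return normalized read counts per transcript and stage.

    read_assignments: iterable of (transcript_id, stage) pairs, one per read.
    per: scaling factor (assumed here: reads per 1000 reads of each stage).
    """
    read_assignments = list(read_assignments)
    # Library size = number of assigned reads observed for each stage.
    library_sizes = Counter(stage for _, stage in read_assignments)
    raw_counts = defaultdict(Counter)
    for transcript_id, stage in read_assignments:
        raw_counts[transcript_id][stage] += 1
    return {
        transcript_id: {
            stage: per * count / library_sizes[stage]
            for stage, count in stage_counts.items()
        }
        for transcript_id, stage_counts in raw_counts.items()
    }


# Toy input with hypothetical transcript identifiers.
reads = (
    [("DEMO_0001", "sporozoite")] * 2
    + [("DEMO_0002", "sporozoite")] * 2
    + [("DEMO_0001", "merozoite")]
)
profile = digital_expression(reads)
```

In this toy input, half of the sporozoite reads belong to DEMO_0001, so its normalized sporozoite value is 500 per 1000 reads.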

Database organization and 
implementation 

The assembled transcripts of the three Eimeria species were 
submitted to an annotation pipeline constructed with 
EGene2, a new version of the platform (16) that includes 
annotation components (available on request). Briefly, the
pipeline consisted of finding all potential ORFs and translating
them into the corresponding protein products. We used an arbitrarily
chosen minimum ORF length of 50 amino acids. All protein
products were inspected for sequence similarity using 
BLASTp (17) against the NCBI non-redundant protein data- 
base, protein domains using RPS-BLAST against Conserved 
Domains Database (CDD) (18), protein motifs with 
InterProScan (19), signal peptide and transmembrane 
domain prediction using Phobius (20) and 
glycosylphosphatidylinositol (GPI) anchoring cleavage site
prediction using DGPI (Kronegg and Buloz, unpublished results,
downloaded from http://129.194.185.165/dgpi/ in
March 2008). Finally, using InterPro IDs, we mapped and
quantified Gene Ontology (GO) terms (21) using a GO 
slim file, a subset of GO terms. Also, all proteins were func- 
tionally classified using KOG (22) and eggNOG (23) data- 
bases of orthology and mapped onto the Kyoto 
Encyclopedia of Genes and Genomes (KEGG) Pathway (24) 
database. We also performed an integrated orthology ana- 
lysis of the translated products of the three Eimeria species 
with data sets of proteins predicted from genomes of six 
apicomplexan parasites: Toxoplasma gondii, Plasmodium 
falciparum, Neospora caninum, Babesia bovis, Theileria 
annulata and Cryptosporidium parvum. For this task, we 
used the programs InParanoid (25) and MultiParanoid
(26), as previously described (14). The Extensible Markup
Language (XML) annotation file generated by EGene was used to
automatically populate a MySQL database using an 
in-house script. The web interface was developed using 
PHP, HTML and JavaScript languages, and it was integrated 
with the database through a set of in-house Perl scripts. All 
annotation data were integrated with the Generic Genome 
Browser (GBrowse), a genome viewer that is widely used 
for visualization of sequence annotation (27). EimeriaTDB is
linked to the NCBI BioProject Database under the accession 
codes PRJNA81161, PRJNA81163 and PRJNA81165. The re- 
pository is publicly available at http://www.coccidia.icb.usp. 
br/eimeriatdb/. Publications that use this database should 
cite the aforementioned URL and this publication. 
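
The ORF detection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an ORF is taken here to be any stop-free stretch in one of the six reading frames (the text does not say whether a start codon was required), the standard genetic code is used, and the function names are hypothetical. It is not the EGene2 implementation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    # Unknown codons (e.g. containing N) are translated as X.
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(seq, min_aa=50):
    """Return (strand, frame, protein) for every stop-free stretch >= min_aa."""
    seq = seq.upper()
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            protein = translate(strand_seq[frame:])
            # Stop codons ("*") delimit the candidate ORFs of this frame.
            for segment in protein.split("*"):
                if len(segment) >= min_aa:
                    orfs.append((strand, frame, segment))
    return orfs
```

With the default `min_aa=50`, the function applies the same minimum length as the pipeline; each returned protein could then be passed to the similarity and domain searches listed above.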

Data analysis tools 

EimeriaTDB offers a variety of services, including a local
BLAST engine, a database-querying page, annotation 
pages of individual transcripts, global analyses of whole 
data sets and a data download page. Table 1 depicts all re- 
sources provided in the website and the respective descrip- 
tions. The interface presents a set of tabs at the main page, 
each one redirecting to a specific service page (Figure 1A). 

Table 1. Resources available at the Eimeria Transcript Database (EimeriaTDB) web site

Resource                  Features
------------------------  ------------------------------------------------------------
Online analysis tools
  Local BLAST engine      Similarity searches against different E. tenella genome
                          assembly versions, assembled and unassembled EST/ORESTES
                          of E. acervulina, E. maxima and E. tenella
  Relational database     Queries using sequence IDs, keywords and evidence results.
                          Access to annotation results of individual sequences
Annotation
  Individual annotation
    Evidence              Orthology analysis with other apicomplexans, BLAST x nr,
                          RPS-BLAST x CDD, InterProScan searches, SignalP, TMHMM,
                          Phobius, DGPI analyses, GO term mapping, functional
                          classification using KOG and eggNOG and pathway mapping
                          using KEGG. Expression analysis, when available,
                          displayed in a graphic
    Downloads             DNA transcript sequence, DNA ORF sequences and protein
                          sequences
    Annotation reports    All sequences annotated with or without ORF selection.
                          Available formats: Feature Table, Feature Table with
                          Artemis additional feature keys and GFF3. Graphical
                          visualization of annotation data on GBrowse.
  Global annotation
    GO term mapping       Classification and quantification of assembled cDNAs
                          into GO terms
    Orthology analysis    Classification and quantification of assembled cDNAs
                          into functional groups of KOG and eggNOG
    Pathway mapping       Classification and quantification of assembled cDNAs
                          into KEGG's Metabolic Pathway Classes
Downloads                 cDNA products (ORFs >50), assembled cDNA sequences,
                          annotation reports in Feature Table, Feature Table with
                          Artemis additional feature keys and GFF3 formats.

BLAST

A local BLAST service is available, and searches can be performed
against many Eimeria databases, including
genomic, cDNA and mitochondrial sequences. Genomic sequences
comprise shotgun reads and several assembly versions
from the Wellcome Trust Sanger Institute
(ftp://ftp.sanger.ac.uk/pub/pathogens/Eimeria/tenella/). Expressed
sequences include assembled cDNAs of E. tenella, E. acervulina
and E. maxima. In the case of E. tenella, the database
contains an assembly constructed from a mixture of
ORESTES and EST reads (14). For E. acervulina and
E. maxima, the current version of the database contains
assemblies obtained with ORESTES reads only. All programs
of the BLAST package can be used: blastn, blastp, blastx, tblastx
and tblastn. Once a given assembled cDNA hit is identified,
the user can consult the relational database to inspect the
corresponding annotation using the sequence ID.

Querying the database 

The Search Database section allows users to perform customized
queries to EimeriaTDB (Figure 1B). The database integrates
data from the three Eimeria species and results
from all programs used to collect evidence. If the user already
knows the sequence ID, for example 'Eten_0011',
then the corresponding annotation can be directly
retrieved. Searches can also be performed using single or
multiple query terms. Query terms include product names
(e.g. hexokinase, serine protease, microneme protein and
so forth), descriptions and IDs derived from InterPro, KOG 
(e.g. KOG1696; 60S ribosomal protein L19), eggNOG (e.g. 
euNOG10377; transporter protein) and KEGG (e.g. citrate 
cycle; K01647, citrate synthase; large subunit ribosomal 
protein L19e and so forth). Queries can be restricted 
using different sets of radio buttons to a specific Eimeria 
species, or according to different types of evidence. In the 
latter case, search results can be restricted to only 
those sequences presenting a given subset of results. For 
instance, a user can specify 'receptor' as a keyword and 
restrict the results to the sequences presenting positive results
for transmembrane domains and a signal peptide. In this
case, the sequences retrieved by the search most probably
correspond to membrane-bound proteins, such as
G-protein-coupled and other receptors. As a result of the
query, the user obtains a list of sequences fulfilling the 
search criteria, with specific links to the respective annota- 
tion pages. 
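
The combination of a keyword with evidence filters can be pictured as a parameterized SQL query. The sketch below uses Python's built-in sqlite3 module and an in-memory table so that it runs on its own; the table layout, column names, DEMO identifiers and rows are all hypothetical and do not reflect the actual EimeriaTDB MySQL schema or its contents.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with invented rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE transcript (
               seq_id TEXT PRIMARY KEY,
               species TEXT,
               product TEXT,
               has_tm INTEGER,
               has_signal_peptide INTEGER)"""
    )
    rows = [
        ("DEMO_0001", "E. tenella", "putative receptor kinase", 1, 1),
        ("DEMO_0002", "E. tenella", "hexokinase", 0, 0),
        ("DEMO_0003", "E. maxima", "receptor-like protein", 1, 0),
    ]
    conn.executemany("INSERT INTO transcript VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def search(conn, keyword, species=None, require_tm=False, require_sp=False):
    """Return sequence IDs matching a keyword and optional evidence filters."""
    sql = "SELECT seq_id FROM transcript WHERE product LIKE ?"
    params = [f"%{keyword}%"]
    if species is not None:
        sql += " AND species = ?"
        params.append(species)
    if require_tm:
        sql += " AND has_tm = 1"
    if require_sp:
        sql += " AND has_signal_peptide = 1"
    sql += " ORDER BY seq_id"
    return [row[0] for row in conn.execute(sql, params)]


conn = build_demo_db()
receptors_with_tm_and_sp = search(conn, "receptor", require_tm=True, require_sp=True)
```

Here the keyword 'receptor' alone matches two demo rows, whereas requiring both a transmembrane domain and a signal peptide narrows the result to a single sequence, mirroring the example given in the text.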

Figure 1. Screenshots of some resources available at EimeriaTDB. The home page (A) contains tabs redirecting to specific service
pages. The search page (B) allows querying the database using sequence IDs or keywords. Queries can be restricted to specific
Eimeria species or according to different types of evidence. The annotation page (C) provides access to sequence and annotation
data, orthology analysis within apicomplexan organisms and a link to the respective GBrowse (D) screen. When available,
expression data are displayed in a graphic (E).

Exploring transcript annotation

Annotation is provided in three distinct formats: Feature
Table (FT), extended FT and Generic Feature Format version 3
(GFF3) (Figure 1C). FT is the annotation format and vocabulary
adopted by the main sequence repositories (DDBJ/
EMBL/GenBank). A definition of FT is available at the
International Nucleotide Sequence Database Collaboration
site (http://www.insdc.org/files/feature_table.html). We also
provide an extended FT version, which includes some specific
tags that are not officially part of the FT specification
but are compatible with the Artemis annotation and editing
tool (28). For GFF3, we followed the definition available
at the Sequence Ontology Project
(http://www.sequenceontology.org/gff3.shtml). The annotation files
are available with and without automatic ORF selection (see
'Evidence-based annotation' section), and all results (selected
and unselected ORFs) are available for inspection. Also,
annotation can be graphically visualized with GBrowse through
an available link (Figure 1D).
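
For readers who wish to process the downloaded GFF3 files, the following minimal Python sketch parses a single feature line into its nine tab-separated columns and decodes the attribute column. The example line, including the source name and identifiers, is invented for illustration and is not copied from an EimeriaTDB file.

```python
from urllib.parse import unquote

GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    # Attributes are 'key=value' pairs separated by ';'; a value may hold
    # several comma-separated entries, and reserved characters are
    # percent-encoded.
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Hypothetical feature line (all values invented for the example).
example = "DEMO_0001\tEGene\tCDS\t1\t450\t.\t+\t0\tID=DEMO_0001.orf1;product=hypothetical%20protein"
feature = parse_gff3_line(example)
```

Comment lines (starting with '#') and embedded FASTA sections are not handled by this sketch and would need to be skipped by the caller.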



Orthology analysis across 
apicomplexan parasites 

Orthologues identified in other Apicomplexa organisms are 
listed in a specific table (Figure 1C), which displays the cor- 
responding sequence IDs, KOG IDs and, when available, 
BLAST hits. Also, links to the amino acid sequences of the 
orthologues are provided. Orthologues of other Eimeria
species are cross-referenced through links to the respective
annotation page at EimeriaTDB.
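
The idea underlying pairwise orthologue detection can be illustrated with reciprocal (two-way) best hits, which InParanoid uses as seed orthologues before adding in-paralogues and confidence values. The Python sketch below covers only this seed step, on invented similarity scores; it is a simplification for illustration and not a substitute for InParanoid or MultiParanoid.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of (query_id, subject_id) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented scores between proteins of two hypothetical species A and B.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 90}
b_vs_a = {("b1", "a1"): 210, ("b2", "a1"): 100, ("b2", "a2"): 95}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy case a2 selects b2, but b2 scores higher against a1, so only the pair (a1, b1) is reported.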

Gene expression profiles 

When available, we provide a chart displaying the expres- 
sion profile of the gene across different Eimeria develop- 
mental stages (Figure 1E). The expression data of each 
stage are based on the normalized number of reads com- 
prising each assembled sequence according to their respect- 
ive source (29). Briefly, we used in-house scripts (available 
on request) to convert the CAP3 assembly files into spread- 
sheets that list the number of reads belonging to each 
assembled sequence with regard to their respective devel- 
opmental stage, as previously described by Novaes et al. 
(14). The corresponding P-value and status of expression 
(differentiated/non-differentiated) are also displayed. 
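
The statistic of Audic and Claverie (29) gives the probability of observing y reads of a transcript in a library of N2 reads, given that x reads were observed in a library of N1 reads: p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The Python sketch below evaluates this expression in log space and derives an upper-tail probability from it. It assumes that the P-values displayed on the site are of this kind; the function names and the choice of a one-sided tail are ours and do not describe the in-house scripts.

```python
from math import exp, lgamma, log


def audic_claverie_probability(x, y, n1, n2):
    """p(y | x) for read counts x, y from libraries of n1 and n2 reads."""
    ratio = n2 / n1
    log_p = (
        y * log(ratio)
        + lgamma(x + y + 1)      # log (x + y)!
        - lgamma(x + 1)          # log x!
        - lgamma(y + 1)          # log y!
        - (x + y + 1) * log(1 + ratio)
    )
    return exp(log_p)


def upper_tail_p_value(x, y, n1, n2):
    """Probability of observing y or more reads in library 2, given x."""
    lower = sum(audic_claverie_probability(x, k, n1, n2) for k in range(y))
    return max(0.0, 1.0 - lower)
```

For two libraries of equal size and x = 0, the probability of also observing y = 0 is 1/2, which provides a quick sanity check of the implementation.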

Evidence-based annotation 

The final part of the annotation page provides descriptions 
and links for all program results that give support to a func- 
tion for the putative gene (Figure 2A). By clicking on the 
respective link, the user is redirected to the specific page of 
each program result, such as sequence alignments and 
graphical results. In addition, links to mapped GO terms, 
functional classification using KOG and eggNOG databases 
and pathway mapping on KEGG are also presented. When 
available, KEGG results contain links to the corresponding 
KEGG Orthology (KO) page on KEGG's site and pathway 
image (Figure 2B). Stored results are also available for the 
following programs: BLAST (Figure 2C), RPS-BLAST,
InterProScan (Figure 2D), SignalP, TMHMM, Phobius and
DGPI. Our annotation pipeline has automatically selected 
the most probable coding ORF, based on weighted criteria 
on a set of bioinformatics analysis results for each ORF of an 
assembled transcript. Nevertheless, if the user wants to in- 
spect the results of all ORFs, we provide a link entitled 'evi- 
dence for all predicted ORFs' at the bottom of the 
annotation page. 

Global analyses 

A specific section of the site provides both qualitative and 
quantitative analyses for the whole sets of translated products
of E. acervulina, E. maxima and E. tenella. Analyses
include GO term mapping, orthology functional classifica- 
tion using KOG and eggNOG databases and pathway map- 
ping using KEGG. All annotated proteins are mapped onto 
GO terms using a GO slim file comprising a subset of GO 
terms. The results are presented in a composite table com- 
prising the three ontology domains, with the respective GO 
slim terms and sequence counts. If the user clicks on the GO
term itself, the page is redirected to the AmiGO browser
(30), showing the corresponding term description. Also, 
there are links to all sequences whose products have been 
mapped to the particular GO term. All translated protein 
sequences are also mapped onto KEGG Orthology data- 
base, and the corresponding pathways are identified. The 
KEGG Pathway classes are listed on a table with the respect- 
ive sequence counts (Figure 2E), and distribution is depicted 
in a pie chart. By clicking on a KEGG Pathway Class link (e.g. 
metabolism), an expanded list of subclasses is displayed. 
Each subclass presents the corresponding number of 
classified sequences and contains a link that opens up a 
page with the list of proteins (with links to BLAST 
alignments), Orthology Group (KO number), KO descrip- 
tions, E.C. numbers and KEGG pathways. Each pathway 
provides a link to the corresponding KEGG pathway 
image, with the respective query protein highlighted in 
a red-labelled box (Figure 2B). Finally, the transcript 
products are also mapped onto KOG and eggNOG data- 
bases. In both cases, the results are displayed in a table 
listing the functional categories and the respective 
number of sequences classified in each one (Figure 2F). By 
clicking on the one-letter functional class code, a page dis- 
playing a list of all proteins classified in this specific cat- 
egory is presented, with links to the corresponding BLAST 
alignments. 
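
The sequence counts shown in the GO slim table can be thought of as the number of distinct proteins whose GO terms map to each slim term. A minimal Python sketch of this counting step follows; the term-to-slim mapping and the protein identifiers are a hand-written toy example, whereas a real analysis would derive the mapping from the ontology and the chosen GO slim file.

```python
from collections import defaultdict


def count_slim_terms(protein_terms, term_to_slim):
    """Count distinct proteins per GO slim term.

    protein_terms: dict of protein_id -> set of GO term IDs.
    term_to_slim: dict of GO term ID -> iterable of GO slim term IDs.
    """
    members = defaultdict(set)
    for protein, terms in protein_terms.items():
        for term in terms:
            for slim in term_to_slim.get(term, ()):
                # A set ensures each protein is counted once per slim term.
                members[slim].add(protein)
    return {slim: len(proteins) for slim, proteins in members.items()}


# Toy mapping: hexokinase activity -> catalytic activity,
# glycolytic process -> metabolic process.
term_to_slim = {
    "GO:0004396": ["GO:0003824"],
    "GO:0006096": ["GO:0008152"],
}
protein_terms = {
    "DEMO_0001": {"GO:0004396", "GO:0006096"},
    "DEMO_0002": {"GO:0006096"},
}
slim_counts = count_slim_terms(protein_terms, term_to_slim)
```

The same counting scheme applies to the KOG, eggNOG and KEGG class tables, with functional categories or pathway classes taking the place of GO slim terms.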

Retrieving data 

Each sequence annotation page provides links to the re- 
spective transcript DNA sequence and translated product 
in FASTA format, plus annotation data in Feature Table 
and GFF3 formats. Also, the Downloads section allows the 
user to download tarball compressed files that comprise 
global data sets for each of the three Eimeria species, 
including nucleotide and amino acid sequences and anno- 
tation files. 

Figure 2. Evidence-based annotation. Each annotation page provides specific links to different bioinformatics analyses (A),
including sequence mapping onto KEGG pathways (B), BLAST similarity (C) and InterPro motif searching (D). Global analyses
include functional classification into KEGG Metabolic Pathway (E) and KOG orthology (F) classes.

Future directions

By the time our group had described the transcriptomes
of three Eimeria species that infect chickens, Amiruddin
et al. (15) reported a full-length transcript sequencing
initiative in E. tenella, comprising the entire sequence
of 443 E. tenella transcripts and corresponding to ~5% of
the parasite transcriptome. To our knowledge, some
other groups are conducting RNA-seq studies in different
developmental stages of the parasite. We intend to
incorporate all publicly available transcriptome sequence
data in future releases of the database, thus providing
increasingly higher coverage of reconstructed Eimeria
transcripts. We currently use a relatively simple database
schema for EimeriaTDB. However, a newly developed
version of the EGene platform will perform automatic
annotation using Chado (31), the relational database
schema that underlies the Generic Model Organism
Database (GMOD) applications. We intend to incorporate
this schema and associated annotation in future releases of
EimeriaTDB.

EuPathDB has recently incorporated genomic and EST 
data of E. tenella, offering a new perspective of compara- 
tive analysis across apicomplexan and other protozoan 
parasites. EimeriaTDB, by mainly focusing on transcript reconstruction
and annotation, may represent a valuable and
complementary resource for the Eimeria scientific commu- 
nity, and for those researchers interested in comparative 
genomics of apicomplexan parasites. 

Acknowledgements 

The authors wish to thank the colleagues from the Eimeria 
research community for sharing transcript sequences, 
including unpublished data. 

Funding 

Sao Paulo Research Foundation (FAPESP 03/14031-3); 
National Council for Scientific and Technological Develop- 
ment (CNPq, Brazil) (to J.N.) and the work presented herein 
formed part of her PhD thesis; CNPq and FAPESP (to L.T.R.) 
and the work presented herein formed part of his MSc the- 
sis; Productivity-in-Research fellowships from CNPq (to A.G., 
A.M.B.N.M. and A.M.D.). Funding for open access charge: 
Coordination for the Improvement of Higher Education 
Personnel (CAPES PROEX 1788/2012). 

Conflict of interest. None declared. 



References 

1. Williams,R.B. (1999) A compartmentalised model for the estimation
of the cost of coccidiosis to the world's chicken production indus- 
try. Int. J. Parasitol., 29, 1209-1229. 

2. Shirley,M.W., Smith,A.L. and Tomley,F.M. (2005) The biology of 
avian Eimeria with an emphasis on their control by vaccination. 
Adv. Parasitol., 60, 285-330. 

3. Shirley,M.W. (2000) The genome of Eimeria spp., with special ref- 
erence to Eimeria tenella — a coccidium from the chicken. Int. J. 
Parasitol., 30, 485-493. 

4. Ling,K.H., Rajandream,M.A., Rivailler,P. et al. (2007) Sequencing
and analysis of chromosome 1 of Eimeria tenella reveals a unique 
segmental organization. Genome Res., 17, 311-319. 

5. Logan-Klumpler,F.J., De Silva,N., Boehme,U. et al. (2012) GeneDB--
an annotation database for pathogens. Nucleic Acids Res., 40, 
D98-D108. 

6. Aurrecoechea,C., Heiges,M., Wang,H. et al. (2007) ApiDB: integrated
resources for the apicomplexan bioinformatics resource
center. Nucleic Acids Res., 35, D427-D430. 



7. Blake,D.P., Alias,H., Billington,K.J. et al. (2012) EmaxDB: availability 
of a first draft genome sequence for the apicomplexan Eimeria 
maxima. Mol. Biochem. Parasitol., 184, 48-51. 

8. Li,L., Brunk,B.P., Kissinger,J.C. et al. (2003) Gene discovery in the
apicomplexa as revealed by EST sequencing and assembly of a com- 
parative gene database. Genome Res., 13, 443-454. 

9. Schwarz,R.S., Fetterer,R.H., Rosenberg,G.H. et al. (2010) Coccidian 
merozoite transcriptome analysis from Eimeria maxima in compari- 
son to Eimeria tenella and Eimeria acervulina. J. Parasitol., 96, 
49-57. 

10. Miska,K.B., Fetterer,R.H. and Barfield,R.C. (2004) Analysis of tran- 
scripts expressed by Eimeria tenella oocysts using subtractive hy- 
bridization methods. J. Parasitol., 90, 1245-1252. 

11. Miska,K.B., Fetterer,R.H. and Rosenberg,G.H. (2008) Analysis of 
transcripts from intracellular stages of Eimeria acervulina using ex- 
pressed sequence tags. J. Parasitol., 94, 462-466. 

12. Ng,S.T., Sanusi Jangi,M., Shirley,M.W. et al. (2002) Comparative 
EST analyses provide insights into gene expression in two asexual 
developmental stages of Eimeria tenella. Exp. Parasitol., 101, 
168-173. 

13. Wan,K.L., Chong,S.P., Ng,S.T. et al. (1999) A survey of genes in
Eimeria tenella merozoites by EST sequencing. Int. J. Parasitol., 
29, 1885-1892. 

14. Novaes,J., Rangel,L.T., Ferro,M. et al. (2012) A comparative transcriptome
analysis reveals expression profiles conserved across
three Eimeria spp. of domestic fowl and associated with multiple 
developmental stages. Int. J. Parasitol., 42, 39-48. 

15. Amiruddin,N., Lee,X.W., Blake,D.P. et al. (2012) Characterisation of 
full-length cDNA sequences provides insights into the Eimeria 
tenella transcriptome. BMC Genomics, 13, 21. 

16. Durham,A.M., Kashiwabara,A.Y., Matsunaga,F.T. et al. (2005)
EGene: a configurable pipeline generation system for automated 
sequence analysis. Bioinformatics, 21, 2812-2813. 

17. Altschul,S.F., Madden,T.L., Schaffer,A.A. et al. (1997) Gapped BLAST
and PSI-BLAST: a new generation of protein database search pro- 
grams. Nucleic Acids Res., 25, 3389-3402. 

18. Marchler-Bauer,A., Lu,S., Anderson,J.B. et al. (2011) CDD: a conserved
domain database for the functional annotation of proteins.
Nucleic Acids Res., 39, D225-D229. 

19. Mulder,N. and Apweiler,R. (2007) InterPro and InterProScan: tools 
for protein sequence classification and comparison. Methods Mol. 
Biol., 396, 59-70. 

20. Kall,L., Krogh,A. and Sonnhammer,E.L. (2004) A combined transmembrane
topology and signal peptide prediction method.
J. Mol. Biol., 338, 1027-1036. 

21. Ashburner,M., Ball,C.A., Blake,J.A. et al. (2000) Gene ontology: tool
for the unification of biology. The Gene Ontology Consortium. Nat.
Genet., 25, 25-29.

22. Tatusov,R.L., Fedorova,N.D., Jackson,J.D. et al. (2003) The COG
database: an updated version includes eukaryotes. BMC 
Bioinformatics, 4, 41. 

23. Muller,J., Szklarczyk,D., Julien,P. et al. (2010) eggNOG v2.0: extending
the evolutionary genealogy of genes with enhanced
non-supervised orthologous groups, species and functional annota- 
tions. Nucleic Acids Res., 38, D190-D195. 

24. Kanehisa,M., Goto,S., Sato,Y. et al. (2012) KEGG for integration and 
interpretation of large-scale molecular data sets. Nucleic Acids Res., 
40, D109-D114. 

25. Ostlund,G., Schmitt,T., Forslund,K. et al. (2010) InParanoid 7: new
algorithms and tools for eukaryotic orthology analysis. Nucleic 
Acids Res., 38, D196-D203. 




Database Tool 



Database, Vol. 2013, Article ID bat006, doi:10.1093/database/bat006 



26. Remm,M., Storm,C.E. and Sonnhammer,E.L. (2001) Automatic clus- 
tering of orthologs and in-paralogs from pairwise species compari- 
sons. J. Mol. Biol., 314, 1041-1052. 

27. Stein,L.D., Mungall,C., Shu,S. et al. (2002) The generic genome 
browser: a building block for a model organism system database. 
Genome Res., 12, 1599-610. 

28. Rutherford,K., Parkhill,J., Crook,J. et al. (2000) Artemis: sequence
visualization and annotation. Bioinformatics, 16, 944-945. 



29. Audic,S. and Claverie,J.M. (1997) The significance of digital gene
expression profiles. Genome Res., 7, 986-995. 

30. Carbon,S., Ireland,A., Mungall,C.J. et al. (2009) AmiGO: online access
to ontology and annotation data. Bioinformatics, 25, 288-289. 

31. Mungall,C.J., Emmert,D.B. and FlyBase,C. (2007) A Chado case 
study: an ontology-based modular schema for representing 
genome-associated biological information. Bioinformatics, 23, 
i337-i346. 






